Simultaneous discovery of cancer subtypes and subtype features by molecular data integration

Authors

  • Thanh Le Van
  • Matthijs van Leeuwen
  • Ana Carolina Fierro
  • Dries De Maeyer
  • Jimmy Van den Eynden
  • Lieven P. C. Verbeke
  • Luc De Raedt
  • Kathleen Marchal
  • Siegfried Nijssen
Abstract

MOTIVATION: Subtyping cancer is key to an improved and more personalized prognosis and treatment. The increasing availability of tumor-related molecular data provides the opportunity to identify molecular subtypes in a data-driven way. Molecular subtypes are defined as groups of samples that share a similar molecular mechanism at the origin of carcinogenesis. These mechanisms are reflected in subtype-specific mutational and expression features. Data-driven subtyping is a complex problem because subtyping and identifying the molecular mechanisms that drive carcinogenesis are confounded problems. Many current integrative subtyping methods use global mutational and/or expression tumor profiles to group tumor samples into subtypes but do not explicitly extract the subtype-specific features. We therefore present a method that solves both tasks, subtyping and identification of subtype-specific features, simultaneously. To this end, our method integrates mutational and expression data while taking into account the clonal properties of carcinogenesis. Key to our method is a formalization of the problem as a rank matrix factorization of ranked data that approaches the subtyping problem as multi-view bi-clustering.

RESULTS: We introduce a novel integrative framework to identify subtypes by combining mutational and expression features. The incomparable measurement data are integrated by transformation into ranked data, and subtypes are defined as multi-view bi-clusters. We formalize the model using rank matrix factorization, resulting in the SRF algorithm. Experiments on simulated data and the TCGA breast cancer data demonstrate that SRF is able to capture subtle differences that existing methods may miss.

AVAILABILITY AND IMPLEMENTATION: The implementation is available at: https://github.com/rankmatrixfactorisation/SRF

CONTACT: [email protected], [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
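The abstract states that incomparable measurements are made comparable by transforming them into ranked data. The sketch below illustrates only that general idea and is not the SRF implementation: it assumes, hypothetically, that each row is a tumor sample, that each column is a gene, and that values are ranked within a sample with ties sharing their average rank. The function names `rank_row` and `to_rank_matrix` and the toy numbers are invented for illustration.

```python
def rank_row(values):
    """Return 1-based ranks (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def to_rank_matrix(matrix):
    """Rank every row (sample) independently of the others."""
    return [rank_row(row) for row in matrix]


# Toy data (hypothetical): 2 samples x 3 genes, on very different scales.
expression = [[5.2, 0.1, 3.3], [120.0, 80.0, 95.0]]
mutation = [[0.9, 0.0, 0.0], [0.0, 0.7, 0.7]]

expr_ranks = to_rank_matrix(expression)
mut_ranks = to_rank_matrix(mutation)
print(expr_ranks)
print(mut_ranks)
```

After this transformation both views take values in the same range (1 to the number of genes), so they can be handled by a single model even though the raw scales differ. How SRF itself defines and factorizes the rank matrices is described in the paper and the linked repository.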

Similar resources

Exploring Gene Signatures in Different Molecular Subtypes of Gastric Cancer (MSS/ TP53+, MSS/TP53-): A Network-based and Machine Learning Approach

Gastric cancer (GC) is one of the leading causes of cancer mortality worldwide. Molecular understanding of the different GC subtypes is still poor, and new subtype-specific diagnostic and therapeutic approaches need to be developed. Comprehensive research in this area is therefore needed to gain deeper insight into the molecular processes underlying these subtypes. In this st...

A novel approach for data integration and disease subtyping.

Advances in high-throughput technologies allow for measurements of many types of omics data, yet the meaningful integration of several different data types remains a significant challenge. Another important and difficult problem is the discovery of molecular disease subtypes characterized by relevant clinical differences, such as survival. Here we present a novel approach, called perturbation c...

Consensus Molecular Subtypes of Colorectal Cancer and their Clinical Implications

The colorectal cancer (CRC) subtyping consortium has unified six independent molecular classification systems, based on gene expression data, into a single consensus system with four distinct groups, known as the consensus molecular subtypes (CMS); clinical implications are discussed in this review based on articles relevant to the CMS of CRC indexed in PubMed as well as the authors’ own ...

Identification of Prognostic Genes in Her2-enriched Breast Cancer by Gene Co-Expression Network Analysis

Introduction: HER2-enriched subtype of breast cancer has a worse prognosis than luminal subtypes. Recently, the discovery of targeted therapies in other groups of breast cancer has increased patient survival. The aim of this study was to identify genes that affect the overall survival of this group of patients based on a systems biology approach. Methods: Gene expression data and clinical infor...

Human Cancer Modeling: Recapitulating Tumor Heterogeneity Towards Personalized Medicine

Despite diagnostic, preventive and therapeutic advances, growing incidence of cancer and high rate of mortality among patients affected by specific cancer types indicate current clinical measures are not ideally useful in eradicating cancer. Chemoresistance and subsequent disease relapse are believed to be mainly driven by the cell-molecular heterogeneity of human tumors that necessitates perso...

Journal:
  • Bioinformatics

Volume 32, Issue 17

Pages: -

Publication date: 2016